Klee's measure problem

In computational geometry, Klee's measure problem is the problem of determining how efficiently the measure of a union of (multidimensional) rectangular ranges can be computed. Here, a d-dimensional rectangular range is defined to be a cartesian product of d intervals of real numbers, which is a subset of Rd.

The problem is named after Victor Klee, who gave an algorithm for computing the length of a union of intervals (the case d = 1) which was later shown to be optimally efficient in the sense of computational complexity theory. The computational complexity of computing the area of a union of 2-dimensional rectangular ranges is now also known, but the case d ≥ 3 remains an open problem.

Contents

History and algorithms

In 1977, Victor Klee considered the following problem: given a collection of n intervals in the real line, compute the length of their union. He then presented an algorithm to solve this problem with computational complexity (or "running time") O(n \log n) — see Big O notation for the meaning of this statement. This algorithm, based on sorting the intervals, was later shown by Michael Fredman and Bruce Weide (1978) to be optimal.

Later in 1977, Jon Bentley considered a 2-dimensional analogue of this problem: given a collection of n rectangles, find the area of their union. He also obtained a complexity O(n \log n) algorithm, now known as Bentley's algorithm, based on reducing the problem to n 1-dimensional problems: this is done by sweeping a vertical line across the area. Using this method, the area of the union can be computed without explicitly constructing the union itself. Bentley's algorithm is now also known to be optimal (in the 2-dimensional case), and is used in computer graphics, among other areas.

These two problems are the 1- and 2-dimensional cases of a more general question: given a collection of n d-dimensional rectangular ranges, compute the measure of their union. This general problem is Klee's measure problem.

When generalized to the d-dimensional case, Bentley's algorithm has a running time of O(n^{d-1} \log n). This turns out not to be optimal, because it only decomposes the d-dimensional problem into n (d-1)-dimensional problems, and does not further decompose those subproblems. In 1981, Jan van Leeuwen and Derek Wood improved the running time of this algorithm to O(n^{d-1}) for d ≥ 3 by using dynamic quadtrees.

In 1988, Mark Overmars and Chee Yap proposed an O(n^{d/2} \log n) algorithm for d ≥ 3 which is the fastest known algorithm to date. Their algorithm uses a particular data structure similar to a kd-tree to decompose the problem into 2-dimensional components and aggregate those components efficiently; the 2-dimensional problems themselves are solved efficiently using a trellis structure. Although asymptotically faster than Bentley's algorithm, its data structures use significantly more space, so it is only used in problems where either n or d is large. In 1998, Bogdan Chlebus proposed a simpler algorithm with the same asymptotic running time for the common special cases where d is 3 or 4.

Current status

The only known lower bound for any d is \Omega(n \log n). The Overmars–Yap algorithm provides an upper bound of O(n^{d/2} \log n), so for d ≥ 3, it remains an open question whether faster algorithms are possible, or alternatively whether tighter lower bounds can be proven. In particular, it remains open whether the algorithm's running time must depend on d. In addition, the question of whether there are faster algorithms that can deal with special cases (for example, when the input coordinates are integers within a bounded range) remains open.

References and further reading

Important papers

Secondary literature